Approximate Retrieval of XML Data with ApproXPath

Authors

  • Lin Xu
  • Curtis E. Dyreson
Abstract

Several XML query languages have been proposed that use XPath expressions to locate data. But XPath expressions might miss some data because of irregularities in the data and schema of an XML data collection. In this paper we propose ApproXPath, which supports approximate path expressions. Approximate path expressions have the same syntax as XPath expressions, but allow content and structural errors. An error is a string or tree edit operation that creates a (virtual) data collection in which the data can be located. ApproXPath extends XPath’s axes, node tests and predicates to utilize the string/tree edit distance. We show that the complexity of ApproXPath is reasonable. For many queries, the inexact matching (with no errors) is as fast as exact matching, and the cost increases linearly with the number of errors allowed.
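
The idea of an approximate node test can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a simplified setting with only the child axis, only string edit errors on element names (no tree edits, no predicates), and a single error budget shared across the whole path. The names `edit_distance`, `approx_path`, and the sample document are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approx_path(root, steps, budget):
    """Follow a child-axis path from root, accepting element names that
    differ from the requested step names, as long as the total number of
    string edits along the path stays within budget.
    Returns a list of (element, errors_used) pairs."""
    frontier = [(root, 0)]
    for step in steps:
        next_frontier = []
        for node, used in frontier:
            for child in node:
                d = edit_distance(child.tag, step)
                if used + d <= budget:
                    next_frontier.append((child, used + d))
        frontier = next_frontier
    return frontier


# Hypothetical document with one misspelled element name ("bok").
DOC = ("<library>"
       "<book><title>XML</title></book>"
       "<bok><title>XPath</title></bok>"
       "</library>")
root = ET.fromstring(DOC)

exact = [(e.text, n) for e, n in approx_path(root, ["book", "title"], 0)]
approx = [(e.text, n) for e, n in approx_path(root, ["book", "title"], 1)]
print(exact)
print(approx)
```

With a budget of zero the sketch behaves like an exact path match and finds only the title under `book`; with a budget of one it also reaches the title under the misspelled `bok` element and reports that one error was used to get there.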

Similar Resources

Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica

Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...

Approximate Tree Embedding for Querying XML Data

Querying heterogeneous collections of data-centric XML documents requires a combination of database languages and concepts used in information retrieval, in particular similarity search and ranking. In this paper we present an approach to find approximate answers to formal user queries. We reduce the problem of answering queries against XML document collections to the well-known unordered tree ...

FuzzyXPath: Using Fuzzy Logic and IR Features to Approximately Query XML Documents

XML has become a key technology for interoperability, providing a common data model to applications. However, diverse data modeling choices may lead to heterogeneous XML structure and content. In this paper, information retrieval and database-related techniques have been jointly applied to effectively tolerate XML data diversity in the evaluation of flexible queries. Approximate structure and c...

Information Retrieval of Sequential Data in Heterogeneous XML Databases

The XML language is a W3C standard sustained by both the industry and the scientific community. Therefore, the available information annotated in XML keeps and will keep increasing in size. Nonetheless, not only the volume of the XML information is increasing but also its complexity. The XML documents evolved from plain structured text representations, to documents having complex and heterogene...

Cooperative XML (CoXML) Query Answering at INEX 03

The Extensible Markup Language (XML) is becoming the most popular format for information representation and data exchange. Much research has been investigated on providing flexible query facilities while aiming at efficient techniques to extract data from XML documents. However, most of them are focused on only the exact matching of query conditions. In this paper, we describe a cooperative XML...

Publication date: 2008